Abstract. We give an $O(\log n)$ bound for the expectation of the logarithm of the condition number $\mathcal{K}(A, b, c)$ introduced in "Solving linear programs with finite precision: I. Condition numbers and random programs," Math. Program., 99:175-196, 2004. This bound improves the previously existing bound, which was $O(n)$.

1 Introduction 

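Consider the following linear programming problem (in standard form),

$$\begin{array}{ll} \min & c^T x \\ \text{s.t.} & Ax = b \\ & x \ge 0. \end{array} \tag{P}$$

Here $A \in \mathbb{R}^{m \times n}$, $b \in \mathbb{R}^m$, $c \in \mathbb{R}^n$, and $n \ge m \ge 1$.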

Assuming this problem is feasible (i.e., the set given by $Ax = b$, $x \ge 0$, is not empty) and bounded (i.e., the function $x \mapsto c^T x$ is bounded below on the feasible set), algorithms solving (P) may return an optimizer $x^* \in \mathbb{R}^n$ and/or the optimal value $c^T x^*$. Whereas these two computations are essentially equivalent in the presence of infinite precision, obtaining an optimizer appears to be more difficult if only finite precision is available. Accuracy analyses of interior-point algorithms for these problems have been done in [13] (for the computation of the optimal value) and in [6] (for the computation of an optimizer). In both cases, accuracy bounds (as well as complexity bounds) are given in terms of the dimensions $m$ and $n$, as well as of the logarithm of a condition number. The bounds in both analyses are similar. What turns out to be different is their relevant condition numbers.

This work has been substantially funded by a grant from the Research Grants Council of the Hong Kong SAR (project number CityU 1085/02P).

In [13] this is Renegar's condition number $C(A,b,c)$ which, roughly speaking, is the relativized inverse of the size of the smallest perturbation needed to make (P) either infeasible or unbounded. In [6] it is the condition number $\mathcal{K}(A,b,c)$ which, following the same idea, is the relativized inverse of the size of the smallest perturbation needed to change the optimal basis of (P) (a detailed definition is in Section 2 below).

A characteristic of these (and practically all other) condition numbers is that they cannot be easily computed from the data at hand. Their computation appears to be at least as difficult as that of the solution for the problem whose condition they are measuring (see [10] for a discussion on this) and requires at least the same amount of precision (see [5]). A way out of this dilemma, going back to the very beginning of condition numbers, is to randomize the data and to estimate the expectation of its condition. Indeed, the first papers on condition were published independently by Turing [12] and by Goldstine and von Neumann [14], both for the condition of linear equation solving, and in a sequel [15] to the latter the matrix $A$ of the input linear system was considered to be random and some probabilistic estimates on its condition number were derived. This approach was subsequently championed by Demmel [8] and Smale [11].

A number of probabilistic estimates for Renegar's condition number (or for a close relative introduced in [3]) have been obtained in the last decade [7, 2, 9]. The overall picture is that the contribution of the log of this condition number to complexity and accuracy bounds is, on the average, $O(\log n)$. In contrast with this satisfactory state of affairs, little is known for the condition number $\mathcal{K}$ on random triples $(A, b, c)$. In [4] it was shown that for these triples, conditioned to (P) being feasible and bounded, $\log \mathcal{K}(A, b, c)$ is $O(n)$ on the average, but this estimate appears to be poor. In the present paper we improve this result and show an $O(\log n)$ bound (see Theorem 1 below for a precise statement).

2 Statement of the Main Result 

In this section we fix notations, recall the definition of $\mathcal{K}(A, b, c)$, and state our main result.

For any subset $B$ of $\{1, 2, \ldots, n\}$, denote by $A_B$ the submatrix of $A$ obtained by removing from $A$ all the columns with index not in $B$. If $x \in \mathbb{R}^n$, $x_B$ is defined analogously. A set $B \subset \{1, 2, \ldots, n\}$ such that $|B| = m$ and $A_B$ is invertible is said to be a basis for $A$.

Let $B$ be a basis. Then we may uniquely solve $A_B x' = b$. Consider the point $x^* \in \mathbb{R}^n$ defined by $x^*_j = 0$ for $j \notin B$ and $x^*_B = x'$. Clearly, $Ax^* = b$. We say that $x^*$ is a primal basic solution. If, in addition, $x^* \ge 0$, which is equivalent to $x^*_B \ge 0$, then we say $x^*$ is a primal basic feasible solution.

The dual of (P), which in the sequel we denote by (D), is the following problem,

$$\begin{array}{ll} \max & b^T y \\ \text{s.t.} & A^T y \le c. \end{array} \tag{D}$$

For any basis $B$, we may now uniquely solve $A_B^T y^* = c_B$. The point $y^*$ thus obtained is said to be a dual basic solution. If, in addition, $A^T y^* \le c$, $y^*$ is said to be a dual basic feasible solution.

Let B be a basis. We say that B is an optimal basis (for the pair (P-D)) if both 
the primal and dual basic solutions are feasible. In this case the points x* and y* 
above are the optimizers of (P) and (D), respectively. 
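The definitions above can be sketched in a few lines of code. The following is a minimal illustration, not part of the original analysis: the helper names `solve`, `basic_solutions`, and `is_optimal_basis`, the 0-based indexing of `B`, and the small example data are all assumptions made here. Exact rational arithmetic is used so that the feasibility tests are not affected by rounding.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve the square system M z = rhs exactly by Gauss-Jordan elimination."""
    k = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(k):
        # Pick a row with a nonzero pivot (raises StopIteration if M is singular).
        piv = next(r for r in range(col, k) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [row[k] for row in aug]

def basic_solutions(A, b, c, B):
    """Return (x, y), the primal and dual basic solutions for the basis B."""
    m, n = len(A), len(A[0])
    A_B = [[A[i][j] for j in B] for i in range(m)]
    x_B = solve(A_B, b)                      # A_B x' = b
    x = [Fraction(0)] * n
    for pos, j in enumerate(B):
        x[j] = x_B[pos]
    A_B_T = [[A_B[i][p] for i in range(m)] for p in range(m)]
    y = solve(A_B_T, [c[j] for j in B])      # A_B^T y = c_B
    return x, y

def is_optimal_basis(A, b, c, B):
    """B is optimal when x >= 0 (primal feasible) and A^T y <= c (dual feasible)."""
    x, y = basic_solutions(A, b, c, B)
    m, n = len(A), len(A[0])
    primal_ok = all(v >= 0 for v in x)
    dual_ok = all(sum(A[i][j] * y[i] for i in range(m)) <= c[j] for j in range(n))
    return primal_ok and dual_ok

# Hypothetical example with m = 2, n = 3.
A = [[1, 0, 1], [0, 1, 1]]
b = [1, 2]
c = [1, 1, 3]
print(is_optimal_basis(A, b, c, [0, 1]))  # True
print(is_optimal_basis(A, b, c, [0, 2]))  # False
```

For the basis `[0, 1]` the primal basic solution is (1, 2, 0) and the dual basic solution is (1, 1), with equal objective values 3, as expected for an optimal pair.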

We denote by $d$ the input data $(A, b, c)$. We say that $d$ is feasible when there exist $x \in \mathbb{R}^n$, $x \ge 0$, and $y \in \mathbb{R}^m$ such that $Ax = b$ and $A^T y \le c$. Let

$$\mathcal{U} = \{d = (A, b, c) \mid d \text{ has a unique optimal basis}\}.$$

By definition, triples in $\mathcal{U}$ are feasible.

To define conditioning, we need a norm in the space of data triples. To do so, we associate to each triple $d = (A, b, c) \in \mathbb{R}^{mn+m+n}$ a matrix $M_d \in \mathbb{R}^{(m+1) \times (n+1)}$ assembled from $A$, $b$, and $c$, and we define $\|d\|$ to be the operator norm $\|M_d\|_{rs}$ of $M_d$ considered as a linear map from $\mathbb{R}^{n+1}$ to $\mathbb{R}^{m+1}$. Note that this requires norms $\|\ \|_r$ and $\|\ \|_s$ in $\mathbb{R}^{n+1}$ and $\mathbb{R}^{m+1}$, respectively.
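For the case $r = 1$, $s = 2$ used later in the paper, the operator norm has a simple closed form: since the extreme points of the unit ball of $\|\ \|_1$ are the signed unit vectors, the norm equals the largest Euclidean norm of a column. The sketch below illustrates this for a generic matrix; the function name `norm_12` and the example matrix are assumptions made here for illustration.

```python
import math

def norm_12(M):
    """Operator norm of M from (R^q, ||.||_1) to (R^p, ||.||_2):
    the largest Euclidean norm among the columns of M."""
    rows, cols = len(M), len(M[0])
    return max(math.sqrt(sum(M[i][j] ** 2 for i in range(rows)))
               for j in range(cols))

# The first column (3, 4) has Euclidean norm 5, the second (0, 1) has norm 1.
print(norm_12([[3, 0], [4, 1]]))  # 5.0
```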

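Let $\Sigma_{\mathcal{U}}$ be the boundary of $\mathcal{U}$ in $\mathbb{R}^{mn+m+n}$. For any data input $d \in \mathcal{U}$, we define the distance to ill-posedness and the condition number for $d$, respectively, as follows,

$$\varrho(d) = \min\{\|\delta d\| : d + \delta d \in \Sigma_{\mathcal{U}}\} \quad \text{and} \quad \mathcal{K}(d) = \frac{\|d\|}{\varrho(d)}.$$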




We next state our main result, after making precise the underlying probability 
model. 

Definition 1 We say that $d = (A, b, c)$ is Gaussian, and we write $d \sim N(0, \mathrm{Id})$, when all entries of $A$, $b$ and $c$ are i.i.d. with standard normal distribution.

Theorem 1 For the $\|\ \|_{12}$ norm we have

$$\mathop{\mathbb{E}}_{d \sim N(0,\mathrm{Id})} \bigl(\ln \mathcal{K}(d) \mid d \in \mathcal{U}\bigr) \le \frac{5}{4} \ln(m+1) + \frac{3}{2} \ln(n+1) + \ln(12).$$

Remark 1 The use of the $\|\ \|_{12}$ norm in Theorem 1 is convenient but inessential. Well known norm equivalences yield $O(\log n)$ bounds for any of the usually considered matrix norms.
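To get a feeling for how slowly the bound of Theorem 1 grows, it can be evaluated directly. The sketch below is only an illustration; the function name `theorem1_bound` is an assumption, and the coefficients 5/4 and 3/2 are the ones obtained by combining the estimates of Lemma 7 and Lemma 9 in the proof in Section 3.

```python
import math

def theorem1_bound(m, n):
    """Right-hand side of Theorem 1: (5/4) ln(m+1) + (3/2) ln(n+1) + ln 12."""
    return 1.25 * math.log(m + 1) + 1.5 * math.log(n + 1) + math.log(12)

# Hypothetical problem sizes; the bound grows logarithmically in m and n.
for m, n in [(5, 10), (50, 100), (500, 1000)]:
    print(m, n, round(theorem1_bound(m, n), 2))
```

Multiplying both dimensions by 100 adds only $(5/4 + 3/2) \ln 100 \approx 12.7$ to the bound, in contrast with a bound that is linear in $n$.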



3 Proof of the Main Result 

3.1 A useful characterization 

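Write $\mathcal{D} = \mathbb{R}^{mn+m+n}$ for the space of data inputs, and

$$\mathcal{B} = \{B \subset \{1, 2, \ldots, n\} \mid |B| = m\}$$

for the family of possible bases.

For any $B \in \mathcal{B}$ and any triple $d \in \mathcal{D}$, let $\mathcal{S}_1$ be the set of all $m$ by $m$ submatrices of $[A_B, b]$, $\mathcal{S}_2$ the set of all $m+1$ by $m+1$ submatrices of $(A^T, c)$ containing $A_B^T$, and $\mathcal{S}_B(d) = \mathcal{S}_1 \cup \mathcal{S}_2$. Note that $|\mathcal{S}_1| = m+1$ and $|\mathcal{S}_2| = n-m$, so $\mathcal{S}_B(d)$ has $n+1$ elements.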
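The count of $n+1$ submatrices can be checked by enumerating them explicitly. The following sketch builds both families for a small example; the function name `submatrix_family`, the 0-based index set `B`, and the data are assumptions made here for illustration.

```python
def submatrix_family(A, b, c, B):
    """Return (S1, S2): the two families of square submatrices attached to B."""
    m, n = len(A), len(A[0])
    # [A_B, b] is an m x (m+1) matrix.
    AB_b = [[A[i][j] for j in B] + [b[i]] for i in range(m)]
    # S1: delete one of its m+1 columns, leaving an m x m matrix.
    S1 = [[[row[k] for k in range(m + 1) if k != drop] for row in AB_b]
          for drop in range(m + 1)]
    # (A^T, c) is n x (m+1); its row j is (a_j^T, c_j).
    AT_c = [[A[i][j] for i in range(m)] + [c[j]] for j in range(n)]
    # S2: keep the rows indexed by B plus one extra row j outside B.
    S2 = [[AT_c[k] for k in sorted(list(B) + [j])]
          for j in range(n) if j not in B]
    return S1, S2

# Hypothetical example with m = 2, n = 4 and B = {0, 2}.
A = [[1, 2, 3, 4], [5, 6, 7, 8]]
b = [9, 10]
c = [11, 12, 13, 14]
S1, S2 = submatrix_family(A, b, c, [0, 2])
print(len(S1), len(S2), len(S1) + len(S2))  # 3 2 5
```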

Let Sing be the set of singular matrices. For any square matrix $S$, we define the distance to singularity as follows.

$$\rho_{\mathrm{Sing}}(S) := \min\{\|\delta S\| : (S + \delta S) \in \mathrm{Sing}\}.$$

For any $B \in \mathcal{B}$ consider the function

$$h_B : \mathcal{D} \to [0, +\infty), \qquad d \mapsto \min_{S \in \mathcal{S}_B(d)} \rho_{\mathrm{Sing}}(S).$$

The following characterization of $\varrho(d)$ is Theorem 2 in [4].

Theorem 2 For any $d \in \mathcal{U}$,

$$\varrho(d) = h_B(d)$$

where $B$ is the optimal basis of $d$. □
3.2 The group action

We consider the group (with respect to componentwise multiplication) $\mathfrak{G}_n = \{-1, 1\}^n$. This group acts on $\mathcal{D}$ as follows. For $u \in \mathfrak{G}_n$ let $D_u$ be the diagonal matrix having $u_j$ as its $j$th diagonal entry, and

$$u(A) := A D_u = (u_1 a_1, u_2 a_2, \ldots, u_n a_n),$$
$$u(c) := D_u c = (u_1 c_1, u_2 c_2, \ldots, u_n c_n),$$

where $a_i$ denotes the $i$th column of $A$. We define $u(d) := (u(A), b, u(c))$. The group $\mathfrak{G}_n$ also acts on $\mathbb{R}^n$ by $u(x) := (u_1 x_1, \ldots, u_n x_n)$. It is immediate to verify that for all $A \in \mathbb{R}^{m \times n}$, all $x \in \mathbb{R}^n$, and all $u \in \mathfrak{G}_n$ we have $u(A)u(x) = Ax$.
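The identity $u(A)u(x) = Ax$ holds because each sign $u_j$ appears twice in the $j$th term of every row sum and $u_j^2 = 1$. A small check, with function names and data chosen here purely for illustration:

```python
def act_on_matrix(u, A):
    """u(A) = A D_u: multiply the j-th column of A by u[j]."""
    return [[u[j] * A[i][j] for j in range(len(u))] for i in range(len(A))]

def act_on_vector(u, x):
    """u(x) = (u_1 x_1, ..., u_n x_n)."""
    return [uj * xj for uj, xj in zip(u, x)]

def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]

A = [[1, -2, 3], [0, 4, -1]]
x = [2, 1, -5]
u = [-1, 1, -1]
print(matvec(act_on_matrix(u, A), act_on_vector(u, x)) == matvec(A, x))  # True
```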

Lemma 1 The functions $h_B$ are $\mathfrak{G}_n$-invariant. That is, for any $d \in \mathcal{D}$, $B \in \mathcal{B}$ and $u \in \mathfrak{G}_n$,

$$h_B(d) = h_B(u(d)).$$

Proof. Let $S^*$ be any matrix in $\mathcal{S}_B(d)$ such that

$$\rho_{\mathrm{Sing}}(S^*) = \min_{S \in \mathcal{S}_B(d)} \rho_{\mathrm{Sing}}(S). \tag{1}$$

Let $k$ be the number of rows (or columns) of $S^*$ and $E$ be any matrix in $\mathbb{R}^{k \times k}$ such that $S^* + E \in \mathrm{Sing}$ and

$$\|E\| = \rho_{\mathrm{Sing}}(S^*). \tag{2}$$

Then, there exists a nonzero $z \in \mathbb{R}^k$ such that

$$(S^* + E) z = 0. \tag{3}$$

Suppose $S^*$ consists of the $j_1, j_2, \ldots, j_k$ columns of $M_d$ and let $\bar{u} = (u_{j_1}, u_{j_2}, \ldots, u_{j_k}) \in \mathfrak{G}_k$. Then, by the definition of $\mathcal{S}_B(d)$ and $\mathcal{S}_B(u(d))$, we have $\bar{u}(S^*) \in \mathcal{S}_B(u(d))$. Furthermore,

$$(\bar{u}(S^*) + \bar{u}(E))\,\bar{u}(z) = \bar{u}(S^* + E)\,\bar{u}(z) = (S^* + E) z = 0,$$

the last by Equation (3). That is, $\bar{u}(S^*) + \bar{u}(E)$ is also singular. By the definition of $\rho_{\mathrm{Sing}}$,

$$\rho_{\mathrm{Sing}}(\bar{u}(S^*)) \le \|\bar{u}(E)\|. \tag{4}$$

Since operator norms are invariant under multiplication of arbitrary matrix columns by $-1$ we have $\|E\| = \|\bar{u}(E)\|$. Combining this equality with Equations (1), (2), and (4) we obtain

$$\rho_{\mathrm{Sing}}(\bar{u}(S^*)) \le \min_{S \in \mathcal{S}_B(d)} \rho_{\mathrm{Sing}}(S).$$

Since $\bar{u}(S^*) \in \mathcal{S}_B(u(d))$ we obtain

$$\min_{S \in \mathcal{S}_B(u(d))} \rho_{\mathrm{Sing}}(S) \le \min_{S \in \mathcal{S}_B(d)} \rho_{\mathrm{Sing}}(S).$$

The reversed inequality follows by exchanging the roles of $\mathcal{S}_B(u(d))$ and $\mathcal{S}_B(d)$. □
For any $B \in \mathcal{B}$, let

$$\mathcal{U}_B = \{d \in \mathcal{D} \mid B \text{ is the only optimal basis for } d\}.$$

The set $\mathcal{U}$ of well-posed feasible triples is thus partitioned by the sets $\{\mathcal{U}_B \mid B \in \mathcal{B}\}$.

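Lemma 2 Let $d \in \mathcal{D}$ and $B \in \mathcal{B}$. If $h_B(d) > 0$, then there exists a unique $u \in \mathfrak{G}_n$ such that $u(d) \in \mathcal{U}_B$.

Proof. First observe that, since $\min_{S \in \mathcal{S}_B(d)} \rho_{\mathrm{Sing}}(S) > 0$, we have $A_B$ invertible and therefore $B$ is a basis for $A$. Let $y^*$ and $x^*$ be the dual and primal basic solutions of $d$ for the basis $B$, i.e.

$$y^* = A_B^{-T} c_B, \qquad x^*_B = A_B^{-1} b, \qquad x^*_j = 0,\ \forall j \notin B. \tag{5}$$

Similarly, let $y^u$ and $x^u$ be the dual and primal basic solutions of $u(d)$ for the same basis. Then, using that $u(A) = A D_u$ and $u(c) = D_u c$,

$$y^u = u(A)_B^{-T} u(c)_B = A_B^{-T} (D_u)_B^{-T} (D_u)_B c_B = A_B^{-T} c_B = y^* \tag{6}$$

the third equality by the definition of $(D_u)_B$. Similarly,

$$x^u_B = u(A)_B^{-1} b = (A_B (D_u)_B)^{-1} b = (D_u)_B A_B^{-1} b = (D_u)_B x^*_B \tag{7}$$

and $x^u_j = 0$ for all $j \notin B$. Therefore,

$$\begin{aligned}
B \text{ is optimal for } u(d) &\iff x^u \text{ and } y^u \text{ are both feasible} \\
&\iff \begin{cases} x^u_B \ge 0 \\ u(A)_j^T y^u \le u(c)_j, & \text{for } j \notin B \end{cases} \\
&\iff \begin{cases} u_j x^*_j \ge 0, & \text{for } j \in B \\ u_j (c_j - a_j^T y^*) \ge 0, & \text{for } j \notin B. \end{cases}
\end{aligned} \tag{8}$$

Since by hypothesis $\min_{S \in \mathcal{S}_B(d)} \rho_{\mathrm{Sing}}(S) > 0$,

$$x^*_j \ne 0,\ \forall j \in B \quad \text{and} \quad a_j^T y^* \ne c_j,\ \forall j \notin B. \tag{9}$$

Combining Equations (8) and (9), the statement follows for $u \in \mathfrak{G}_n$ given by $u_j = \mathrm{sign}(x^*_j)$ if $j \in B$ and $u_j = \mathrm{sign}(c_j - a_j^T y^*)$ otherwise. Clearly, this $u$ is unique. □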
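The proof of Lemma 2 is constructive: the sign vector is read off from the primal basic solution on $B$ and from the dual slacks off $B$. The sketch below follows that recipe with exact rational arithmetic. The helper names `solve`, `sign`, and `sign_vector_for_basis`, the 0-based indexing, and the example data are assumptions made here; the nondegeneracy hypothesis $h_B(d) > 0$ is represented only through the two checks that no relevant quantity vanishes.

```python
from fractions import Fraction

def solve(M, rhs):
    """Solve the square system M z = rhs exactly by Gauss-Jordan elimination."""
    k = len(M)
    aug = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(M, rhs)]
    for col in range(k):
        piv = next(r for r in range(col, k) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        aug[col] = [v / aug[col][col] for v in aug[col]]
        for r in range(k):
            if r != col:
                factor = aug[r][col]
                aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    return [row[k] for row in aug]

def sign(v):
    return 1 if v > 0 else -1

def sign_vector_for_basis(A, b, c, B):
    """Return u in {-1, 1}^n with u_j = sign(x*_j) on B and
    u_j = sign(c_j - a_j^T y*) off B, as in the proof of Lemma 2."""
    m, n = len(A), len(A[0])
    A_B = [[A[i][j] for j in B] for i in range(m)]
    x_B = solve(A_B, b)                                    # primal basic solution
    y = solve([list(col) for col in zip(*A_B)], [c[j] for j in B])  # dual basic solution
    u = [0] * n
    for pos, j in enumerate(B):
        if x_B[pos] == 0:
            raise ValueError("degenerate primal basic solution")
        u[j] = sign(x_B[pos])
    for j in range(n):
        if j not in B:
            slack = c[j] - sum(A[i][j] * y[i] for i in range(m))
            if slack == 0:
                raise ValueError("degenerate dual basic solution")
            u[j] = sign(slack)
    return u

# Hypothetical example: B = {0, 1} is a basis but is not optimal for d.
A = [[1, 0, 1], [0, 1, 1]]
b = [-1, 2]
c = [1, 1, 1]
B = [0, 1]
u = sign_vector_for_basis(A, b, c, B)
print(u)  # [-1, 1, -1]

# After applying u, the same recipe returns the all-ones vector,
# i.e. B is now optimal with strict inequalities.
uA = [[u[j] * A[i][j] for j in range(3)] for i in range(2)]
uc = [u[j] * c[j] for j in range(3)]
print(sign_vector_for_basis(uA, b, uc, B))  # [1, 1, 1]
```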

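For $B \in \mathcal{B}$ let

$$\Sigma_B := \{d \in \mathcal{D} \mid h_B(d) = 0\}$$

and $\mathcal{D}_B := \mathcal{D} \setminus \Sigma_B$. Lemma 1 implies that, for all $B \in \mathcal{B}$, $\Sigma_B$ and $\mathcal{D}_B$ are $\mathfrak{G}_n$-invariant. Lemma 2 immediately implies the following corollary.

Corollary 1 For all $B \in \mathcal{B}$ the sets

$$\mathcal{D}_u := \{d \in \mathcal{D}_B \mid u(d) \in \mathcal{U}_B\}, \quad \text{for } u \in \mathfrak{G}_n,$$

are a partition of $\mathcal{D}_B$. □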

3.3 Probabilities 

Definition 2 We say that a distribution $\mathscr{D}$ on the set of triples $d = (A, b, c)$ is $\mathfrak{G}_n$-invariant when

(i) if $d \sim \mathscr{D}$ then $u(d) \sim \mathscr{D}$ for all $u \in \mathfrak{G}_n$.

(ii) for all $B \in \mathcal{B}$, $\mathrm{Prob}\{h_B(d) = 0\} = 0$.

Note that Gaussianity is a special case of $\mathfrak{G}_n$-invariance. Consequently, all results true for a $\mathfrak{G}_n$-invariant distribution also hold for Gaussian data.

Note: For a time to come we fix a $\mathfrak{G}_n$-invariant distribution $\mathscr{D}$ with density function $f$.

Lemma 3 For any $u \in \mathfrak{G}_n$ and $B \in \mathcal{B}$,

$$\mathrm{Prob}\{u(d) \in \mathcal{U}_B\} = \mathrm{Prob}\{d \in \mathcal{U}_B\} = \frac{1}{2^n}.$$

Proof. The equality between probabilities follows from (i) in Definition 2. Therefore, by Corollary 1 and Definition 2 (ii), the probability of each of them is $2^{-n}$. □
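The value $2^{-n}$ rests on the fact that, for almost every $d$, exactly one of the $2^n$ sign vectors $u$ sends $d$ into $\mathcal{U}_B$. The sketch below checks this count numerically for Gaussian data with $m = 2$, $n = 4$ and $B = \{1, \ldots, m\}$ (indices 0 and 1 in the code). It is an illustration under stated assumptions, not part of the proof: the helper names are hypothetical, floating-point arithmetic replaces exact arithmetic, and "unique optimal basis" is tested through strict primal and dual inequalities.

```python
import itertools
import random

def solve(M, rhs):
    """Solve M z = rhs by Gaussian elimination with partial pivoting (floats)."""
    k = len(M)
    aug = [list(map(float, row)) + [float(r)] for row, r in zip(M, rhs)]
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[piv] = aug[piv], aug[col]
        for r in range(col + 1, k):
            factor = aug[r][col] / aug[col][col]
            aug[r] = [a - factor * p for a, p in zip(aug[r], aug[col])]
    z = [0.0] * k
    for r in reversed(range(k)):
        z[r] = (aug[r][k] - sum(aug[r][j] * z[j] for j in range(r + 1, k))) / aug[r][r]
    return z

def is_strictly_optimal(A, b, c, B):
    """True when x_B > 0 and a_j^T y < c_j for every j outside B."""
    m, n = len(A), len(A[0])
    A_B = [[A[i][j] for j in B] for i in range(m)]
    x_B = solve(A_B, b)
    y = solve([list(col) for col in zip(*A_B)], [c[j] for j in B])
    primal_ok = all(v > 0 for v in x_B)
    dual_ok = all(sum(A[i][j] * y[i] for i in range(m)) < c[j]
                  for j in range(n) if j not in B)
    return primal_ok and dual_ok

def count_sign_vectors(m, n, seed):
    """Number of u in {-1, 1}^n for which B = {0, ..., m-1} is optimal for u(d)."""
    rng = random.Random(seed)
    A = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(m)]
    b = [rng.gauss(0, 1) for _ in range(m)]
    c = [rng.gauss(0, 1) for _ in range(n)]
    B = list(range(m))
    count = 0
    for u in itertools.product([-1, 1], repeat=n):
        uA = [[u[j] * A[i][j] for j in range(n)] for i in range(m)]
        uc = [u[j] * c[j] for j in range(n)]
        count += is_strictly_optimal(uA, b, uc, B)
    return count

print([count_sign_vectors(2, 4, seed) for seed in range(5)])
```

Each count should equal 1, which is the statement of Corollary 1 for the sampled triples; averaging over $d$ then gives probability $1/2^n$ for each fixed $u$.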

The following lemma tells us that, for all $B \in \mathcal{B}$, the random variable $h_B(d)$ is independent of the event "$d \in \mathcal{U}_B$".

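Lemma 4 For all measurable $g : \mathbb{R} \to \mathbb{R}$ and $B \in \mathcal{B}$,

$$\mathop{\mathbb{E}}_{d \sim \mathscr{D}} \bigl(g(h_B(d)) \mid d \in \mathcal{U}_B\bigr) = \mathop{\mathbb{E}}_{d \sim \mathscr{D}} \bigl(g(h_B(d))\bigr).$$

Proof. From the definition of conditional expectation and Lemma 3 we have

$$\mathop{\mathbb{E}}_{d \sim \mathscr{D}} \bigl(g(h_B(d)) \mid d \in \mathcal{U}_B\bigr) = \frac{\int_{d \in \mathcal{D}} \mathbb{1}_B(d)\, g(h_B(d))\, f(d)}{\mathrm{Prob}\{d \in \mathcal{U}_B\}} = 2^n \int_{d \in \mathcal{D}} \mathbb{1}_B(d)\, g(h_B(d))\, f(d),$$

where $\mathbb{1}_B$ denotes the indicator function of $\mathcal{U}_B$. Now, for any $u \in \mathfrak{G}_n$, the map $d \mapsto u(d)$ is a linear isometry on $\mathcal{D}$. Therefore

$$\int_{d \in \mathcal{D}} \mathbb{1}_B(d)\, g(h_B(d))\, f(d) = \int_{d \in \mathcal{D}} \mathbb{1}_B(u(d))\, g(h_B(u(d)))\, f(u(d)).$$

Using that $h_B(d) = h_B(u(d))$ (by Lemma 1) and $f(d) = f(u(d))$ (by the $\mathfrak{G}_n$-invariance of $\mathscr{D}$), it follows that

$$\begin{aligned}
\mathop{\mathbb{E}}_{d \sim \mathscr{D}} \bigl(g(h_B(d)) \mid d \in \mathcal{U}_B\bigr) &= 2^n \int_{d \in \mathcal{D}} \mathbb{1}_B(d)\, g(h_B(d))\, f(d) \\
&= \sum_{u \in \mathfrak{G}_n} \int_{d \in \mathcal{D}} \mathbb{1}_B(u(d))\, g(h_B(u(d)))\, f(u(d)) \\
&= \sum_{u \in \mathfrak{G}_n} \int_{d \in \mathcal{D}} \mathbb{1}_B(u(d))\, g(h_B(d))\, f(d) \\
&= \int_{d \in \mathcal{D}} g(h_B(d))\, f(d) = \mathop{\mathbb{E}}_{d \sim \mathscr{D}} \bigl(g(h_B(d))\bigr),
\end{aligned}$$

the last line by Corollary 1. □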
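Let $B^* = \{1, 2, \ldots, m\}$.

Lemma 5 For all measurable $g : \mathbb{R} \to \mathbb{R}$,

$$\mathop{\mathbb{E}}_{d \sim N(0,\mathrm{Id})} \bigl(g(\varrho(d)) \mid d \in \mathcal{U}\bigr) = \mathop{\mathbb{E}}_{d \sim N(0,\mathrm{Id})} \bigl(g(h_{B^*}(d))\bigr).$$

Proof. Let $\varphi$ be the probability density function of $N(0, \mathrm{Id})$. Then

$$\mathop{\mathbb{E}}_{d \sim N(0,\mathrm{Id})} \bigl(g(\varrho(d)) \mid d \in \mathcal{U}\bigr) = \frac{\int_{d \in \mathcal{U}} g(\varrho(d))\, \varphi(d)\, \mathrm{d}(d)}{\mathop{\mathrm{Prob}}_{d \sim N(0,\mathrm{Id})} \{d \in \mathcal{U}\}}. \tag{11}$$

Since $d$ is Gaussian, the probability that $d$ has two optimal bases is 0. Using this and Lemma 3 we see that

$$\mathrm{Prob}\{d \in \mathcal{U}\} = \sum_{B \in \mathcal{B}} \mathrm{Prob}\{d \in \mathcal{U}_B\} = \sum_{B \in \mathcal{B}} \frac{1}{2^n} = \binom{n}{m} \frac{1}{2^n}. \tag{12}$$

Combining Equations (11) and (12), we have

$$\begin{aligned}
\binom{n}{m} \frac{1}{2^n} \mathop{\mathbb{E}}_{d \sim N(0,\mathrm{Id})} \bigl(g(\varrho(d)) \mid d \in \mathcal{U}\bigr) &= \int_{d \in \mathcal{U}} g(\varrho(d))\, \varphi(d)\, \mathrm{d}(d) \\
&= \sum_{B \in \mathcal{B}} \int_{d \in \mathcal{U}_B} g(\varrho(d))\, \varphi(d)\, \mathrm{d}(d),
\end{aligned}$$

the last since the probability that $d$ has two optimal bases is 0. Using now that the entries of $d$ are i.i.d. and Theorem 2 we obtain

$$\begin{aligned}
\frac{1}{2^n} \mathop{\mathbb{E}}_{d \sim N(0,\mathrm{Id})} \bigl(g(\varrho(d)) \mid d \in \mathcal{U}\bigr) &= \int_{d \in \mathcal{U}_{B^*}} g(\varrho(d))\, \varphi(d)\, \mathrm{d}(d) \\
&= \int_{d \in \mathcal{U}_{B^*}} g(h_{B^*}(d))\, \varphi(d)\, \mathrm{d}(d).
\end{aligned}$$

Therefore, by Lemma 3 with $B = B^*$,

$$\mathop{\mathrm{Prob}}_{d \sim N(0,\mathrm{Id})} \{d \in \mathcal{U}_{B^*}\} \mathop{\mathbb{E}}_{d \sim N(0,\mathrm{Id})} \bigl(g(\varrho(d)) \mid d \in \mathcal{U}\bigr) = \int_{d \in \mathcal{U}_{B^*}} g(h_{B^*}(d))\, \varphi(d)\, \mathrm{d}(d).$$

We conclude since, by the definition of conditional expectation and Lemma 4,

$$\mathbb{E}\bigl(g(\varrho(d)) \mid d \in \mathcal{U}\bigr) = \mathbb{E}\bigl(g(h_{B^*}(d)) \mid d \in \mathcal{U}_{B^*}\bigr) = \mathbb{E}\bigl(g(h_{B^*}(d))\bigr). \qquad □$$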

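The following is Lemma 11 in [4].

Lemma 6 For the $\|\ \|_{12}$ in the definition of $\rho_{\mathrm{Sing}}$ we have

$$\mathop{\mathbb{E}}_{S \sim N(0,\mathrm{Id})} \left(\frac{1}{\rho_{\mathrm{Sing}}(S)}\right) \le 2 m^{5/4}$$

where $N(0, \mathrm{Id})$ is the Gaussian distribution in the set of $m \times m$ real matrices.

Lemma 7 Let $B \in \mathcal{B}$ be fixed. Then, for the $\|\ \|_{12}$ in the definition of $\rho_{\mathrm{Sing}}$ we have

$$\mathop{\mathbb{E}}_{d \sim N(0,\mathrm{Id})} \left(\frac{1}{h_B(d)}\right) \le 2 (m+1)^{5/4} (n+1).$$

Proof. For any fixed $d \in \mathcal{D}$,

$$\sum_{S \in \mathcal{S}_B(d)} \frac{1}{\rho_{\mathrm{Sing}}(S)} \ge \max_{S \in \mathcal{S}_B(d)} \frac{1}{\rho_{\mathrm{Sing}}(S)} = \frac{1}{h_B(d)}.$$

Take average on both sides,

$$\begin{aligned}
\mathop{\mathbb{E}}_{d \sim N(0,\mathrm{Id})} \left(\frac{1}{h_B(d)}\right) &\le \sum_{S \in \mathcal{S}_B(d)} \mathbb{E} \left(\frac{1}{\rho_{\mathrm{Sing}}(S)}\right) \\
&\le \sum_{S \in \mathcal{S}_B(d)} 2 (m+1)^{5/4} \qquad \text{by Lemma 6} \\
&\le 2 (m+1)^{5/4} (n+1). \qquad □
\end{aligned}$$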



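The following lemma is proved as Lemma 4.

Lemma 8 For all $r, s \ge 1$ we have

$$\mathop{\mathbb{E}}_{d \sim \mathscr{D}} \bigl(\|d\|_{rs} \mid d \in \mathcal{U}\bigr) = \mathop{\mathbb{E}}_{d \sim \mathscr{D}} \bigl(\|d\|_{rs}\bigr). \qquad □$$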



Lemma 9 We have

$$\mathop{\mathbb{E}}_{d \sim N(0,\mathrm{Id})} \bigl(\|d\|_{12}\bigr) \le 6 \sqrt{n+1}.$$

Proof. Recall that $\|d\|_{12} = \|M_d\|_{12}$. It is well known that $\|M_d\|_{12} \le \|M_d\|$ where the latter is the spectral norm. The statement now follows from the fact that, for a random Gaussian $A \in \mathbb{R}^{(m+1) \times (n+1)}$ we have $\mathbb{E}(\|A\|) \le 6 \sqrt{n+1}$ [1, Lemma 2.4]. □

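Proof of Theorem 1. By Jensen's inequality and Lemma 9,

$$\mathop{\mathbb{E}}_{d} \bigl(\ln \|d\|_{12}\bigr) \le \ln \mathop{\mathbb{E}}_{d} \bigl(\|d\|_{12}\bigr) \le \frac{1}{2} \ln(n+1) + \ln 6. \tag{13}$$

In addition, using now Lemma 7,

$$\mathop{\mathbb{E}}_{d} \bigl(\ln h_{B^*}(d)\bigr) = -\mathop{\mathbb{E}}_{d} \left(\ln \frac{1}{h_{B^*}(d)}\right) \ge -\ln \mathop{\mathbb{E}}_{d} \left(\frac{1}{h_{B^*}(d)}\right) \ge -\ln\bigl(2 (m+1)^{5/4} (n+1)\bigr). \tag{14}$$

By the definition of $\mathcal{K}(d)$ and Lemmas 8 and 5,

$$\begin{aligned}
\mathop{\mathbb{E}}_{d} \bigl(\ln \mathcal{K}(d) \mid d \in \mathcal{U}\bigr) &= \mathop{\mathbb{E}}_{d} \bigl(\ln \|d\|_{12} \mid d \in \mathcal{U}\bigr) - \mathop{\mathbb{E}}_{d} \bigl(\ln \varrho(d) \mid d \in \mathcal{U}\bigr) \\
&= \mathop{\mathbb{E}}_{d} \bigl(\ln \|d\|_{12}\bigr) - \mathop{\mathbb{E}}_{d} \bigl(\ln(h_{B^*}(d))\bigr).
\end{aligned} \tag{15}$$

Combining Equations (13), (14), and (15), the proof is done. □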
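For completeness, the final combination can be written out term by term. The display below is only a worked restatement of the three estimates, using the bound $2(m+1)^{5/4}(n+1)$ from Lemma 7 and the bound $6\sqrt{n+1}$ from Lemma 9; the expectations are over Gaussian $d$.

```latex
\begin{align*}
\mathbb{E}\bigl(\ln \mathcal{K}(d) \mid d \in \mathcal{U}\bigr)
  &= \mathbb{E}\bigl(\ln \|d\|_{12}\bigr) - \mathbb{E}\bigl(\ln h_{B^*}(d)\bigr)
     && \text{by (15)} \\
  &\le \tfrac{1}{2}\ln(n+1) + \ln 6 + \ln\bigl(2(m+1)^{5/4}(n+1)\bigr)
     && \text{by (13) and (14)} \\
  &= \tfrac{1}{2}\ln(n+1) + \ln 6 + \ln 2 + \tfrac{5}{4}\ln(m+1) + \ln(n+1) \\
  &= \tfrac{5}{4}\ln(m+1) + \tfrac{3}{2}\ln(n+1) + \ln 12 .
\end{align*}
```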